RAW SEQUENCE LISTING

PATENT APPLICATION: US/09/830,691A



DATE: 03/19/2003 
TIME: 14:36:22 



Input Set : A:\Seqlist.txt

Output Set: N:\CRF4\03192003\I830691A.raw 

4 <110> APPLICANT: Choi, Eui-Sung 

5 Rhee, Sang-Ki

6 Sohn, Jung-Hoon

7 Park, Soo-Dong

8 Lee, Yoon-Hyoung 

9 Lee, Seung-Jae 

10 Jang, Jae-Kweon 

11 Choi, Seok-Keun 

12 Son, Young-Rok

14 <120> TITLE OF INVENTION: VECTOR FOR THE TRANSFORMATION OF PHAFFIA 

15 RHODOZYMA AND PROCESS OF TRANSFORMATION THEREBY 
18 <130> FILE REFERENCE: 118.12-US-WO 

20 <140> CURRENT APPLICATION NUMBER: 09/830,691A

21 <141> CURRENT FILING DATE: 2001-04-26 

23 <150> PRIOR APPLICATION NUMBER: KR 1998/46547 

24 <151> PRIOR FILING DATE: 1998-10-31 

26 <150> PRIOR APPLICATION NUMBER: PCT/KR99/00265

27 <151> PRIOR FILING DATE: 1999-05-29

29 <160> NUMBER OF SEQ ID NOS: 20

31 <170> SOFTWARE: FastSEQ for Windows Version 4.0 

33 <210> SEQ ID NO: 1 

34 <211> LENGTH: 1223 

35 <212> TYPE: DNA 

36 <213> ORGANISM: Phaffia rhodozyma 

38 <400> SEQUENCE: 1 

39 atggtcaacg ttcccaagac tcgacgtgag ttatagcaat ttcaacaact ctccagacga 60 

40 caaatattcc agtgcatcga aagagtttgt ggataaacgc gacagtttca agggaaagag 120 

41 tcgatggaca gatttggaag acttagccgg tcaaggaact tggggatcac gtggcggagg 180 

42 actcatcaga agaagtcggg atttgtttga tcatagtggg atcaagacaa actggaggat 240 

43 atggctcgcc ttggaaggga atctccggcc tggattcgag gatccgaaag ttgtacgtat 300 

44 ggaaaagctt acacggcttg gatttattat ctttcatagg aacctactgc aagggtaagg 360 

45 cttgcaagaa gcacacgtaa gtcgcttatc ctctccactc tttcatggca tattgtcaac 420 

46 gactggacaa cgcgtccgtt ttgaaacaag tgacttacct gtgaaatttg attctacacc 480

47 tgtatttagc cctcacaagg tacatatcac atcctcccac cccaccctgc ccaacttctt 540 

48 cagttcatct tgctctcggt ttccacattc cctgatgacc tccttgtatg ttctttgcga 600 

49 acgtttgttt ctgtttctgt aggtgaccca gtacaagaag ggaaaggact ccatcttcgc 660 

50 ccagggaaag cgacgatacg accgaaagca gtccggttac ggaggtcaga ccaagcccgt 720 

51 tttccacaag aaggctaaga ccaccaagaa ggtcgtcctt cgattggcgg tatttttgtt 780 

52 tattttgaat tctttttgtg tatgcagact tttgatgatt atgctcctct gtcgtttttt 840 

53 ctcttcaaac agagtgctcc gtctgcagtt cgttcttcct tccaaccaaa acttcaacta 900 

54 cagacatcat aaacagacat cttacttcgg tgttctctct ttttttccgc agagtacaag 960 

55 atgcagatga ccctcaagcg atgcaagcac ttcgagcttg gaggagacaa gaagaccaag 1020 

56 ggttcgtctt ttgtccatat attctctggt tcacttctta tgttcctaac gtacttgttt 1080 
57 cctttttggt tcggatgttg tttctatcgg tggtgttttc ttttctttgg atgcattatc 1140

58 atttatcgtg ttggactgtt ttcctctgct cgtttctttc tcctctgtac ttgtgcttct 1200

59 caggagccgc catctctttc taa 1223


61 <210> SEQ ID NO: 2

62 <211> LENGTH: 350

63 <212> TYPE: DNA

64 <213> ORGANISM: Phaffia rhodozyma

66 <220> FEATURE:

67 <221> NAME/KEY: CDS

68 <222> LOCATION: (30)...(347)

70 <400> SEQUENCE: 2

71 cccttcaagt ctcgtctcaa tcagtcaag atg gtc aac gtt ccc aag act cga      53
72                                 Met Val Asn Val Pro Lys Thr Arg
73                                  1               5

75 cga acc tac tgc aag ggt aag gct tgc aag aag cac acc cct cac aaa     101
76 Arg Thr Tyr Cys Lys Gly Lys Ala Cys Lys Lys His Thr Pro His Lys
77     10                  15                  20

79 gtg acc cag tac aag aag gga aag gac tcc atc ttc gcc cag gga aag     149
80 Val Thr Gln Tyr Lys Lys Gly Lys Asp Ser Ile Phe Ala Gln Gly Lys
81 25                  30                  35                  40

83 cga cga tac gac cga aag cag tcc ggt tac gga ggt cag acc aag ccc     197
84 Arg Arg Tyr Asp Arg Lys Gln Ser Gly Tyr Gly Gly Gln Thr Lys Pro
85                 45                  50                  55

87 gtt ttc cac aag aag gct aag acc acc aag aag gtc gtc ctt cga ttg     245
88 Val Phe His Lys Lys Ala Lys Thr Thr Lys Lys Val Val Leu Arg Leu
89             60                  65                  70

91 gag tgc tcc gtc tgc aag tac aag atg cag atg acc ctc aag cga tgc     293
92 Glu Cys Ser Val Cys Lys Tyr Lys Met Gln Met Thr Leu Lys Arg Cys
93         75                  80                  85

95 aag cac ttc gag ctt gga gga gac aag aag acc aag gga gcc gcc atc     341
96 Lys His Phe Glu Leu Gly Gly Asp Lys Lys Thr Lys Gly Ala Ala Ile
97     90                  95                  100

99 tct ttc taa                                                         350
100 Ser Phe
101 105

104 <210> SEQ ID NO: 3

105 <211> LENGTH: 106

106 <212> TYPE: PRT

107 <213> ORGANISM: Phaffia rhodozyma

109 <400> SEQUENCE: 3

110 Met Val Asn Val Pro Lys Thr Arg Arg Thr Tyr Cys Lys Gly Lys Ala
111  1               5                   10                  15

112 Cys Lys Lys His Thr Pro His Lys Val Thr Gln Tyr Lys Lys Gly Lys
113             20                  25                  30

114 Asp Ser Ile Phe Ala Gln Gly Lys Arg Arg Tyr Asp Arg Lys Gln Ser
115         35                  40                  45

116 Gly Tyr Gly Gly Gln Thr Lys Pro Val Phe His Lys Lys Ala Lys Thr
117     50                  55                  60

118 Thr Lys Lys Val Val Leu Arg Leu Glu Cys Ser Val Cys Lys Tyr Lys
119 65                  70                  75                  80

120 Met Gln Met Thr Leu Lys Arg Cys Lys His Phe Glu Leu Gly Gly Asp
121                 85                  90                  95

122 Lys Lys Thr Lys Gly Ala Ala Ile Ser Phe
123             100                 105

126 <210> SEQ ID NO: 4 

127 <211> LENGTH: 741 

128 <212> TYPE: DNA 

129 <213> ORGANISM: Phaffia rhodozyma 

131 <220> FEATURE: 

132 <221> NAME/KEY: misc_feature

133 <222> LOCATION: (0)...(0) 

134 <223> OTHER INFORMATION: n=a, t, c, or g 

136 <400> SEQUENCE: 4 

137 ctcgagtgga cggtggcaat ggcattcgtg tcgttggtgc tcactcgcaa cccaagcagt 60 

138 cgcttacccg gggtagcctc cgggtgggcg cgatgatttg tggtgtggat tccttcccta 120 

139 tgggtagaac gacgcgcaac caatcattcg gagaaccgct ccgttgtagc cgaccagtct 180 

140 gattgatcaa catgccagca cgtcctccgg gacggagact ggcggggatc gtacctcatc 240 

141 tggaatcgct ggctcaatgg tagtagtctt cacgatcggc catgagggca gtctaggtgg 300 
W--> 142 gttcgcctgc cgaagactgt gtgagtgtgc tganaactaa ttgagtaccg ggggataagg 360

143 caaggcgtgt ntggttgccg gtggctgtga gcgagtttgc tgcaaagcga ttcaatgcac 420 
144 cccggcttgg ccagcgcgct gcgtcacgaa acacactaaa cggttgacgc cataaagtaa 480

145 taacacactc aagtttgtgg tcccgggtgg gcctctgtgc ctgcgtggga cccgacggga 540

146 gaggaaaacg ttctgtggcc ctctcctctg tggatagtta cctggttgat cctgccagta 600 

147 gtcatatgct tgtctcaaag attaagccat gcatgtctaa gtataaacaa attcatactg 660 

148 tgaaactgcg aatggctcat taaatcagtt atagtttatt tgatggtacc ttgctacatg 720 
149 gataactgtg gtaattctag a 741

151 <210> SEQ ID NO: 5 

152 <211> LENGTH: 23 

153 <212> TYPE: DNA 

154 <213> ORGANISM: Artificial Sequence 

156 <220> FEATURE: 

157 <223> OTHER INFORMATION: CYH1, a PCR primer for the cloning of L41 genomic

158 DNA fragment

W--> 160 <221> NAME/KEY: misc_feature

161 <222> LOCATION: (0)...(0) 

162 <223> OTHER INFORMATION: n=a, t, c, or g 
W--> 164 <400> 5 

W--> 165 cgcgtagtta aygtnccnaa rac 23 

167 <210> SEQ ID NO: 6 

168 <211> LENGTH: 25 

169 <212> TYPE: DNA 

170 <213> ORGANISM: Artificial Sequence 

172 <220> FEATURE: 

173 <223> OTHER INFORMATION: CYH3, a PCR primer for the cloning of L41 genomic 

174 DNA fragment 

176 <400> SEQUENCE: 6 

177 cccgggtytt ggcyttyttr tgraa 25 
179 <210> SEQ ID NO: 7 
180 <211> LENGTH: 24

181 <212> TYPE: DNA 

182 <213> ORGANISM: Artificial Sequence 

184 <220> FEATURE: 

185 <223> OTHER INFORMATION: 3' RACE primer 

187 <400> SEQUENCE: 7 

188 ggtcagacca agcaagtttt tcac 24 

190 <210> SEQ ID NO: 8 

191 <211> LENGTH: 24 

192 <212> TYPE: DNA 

193 <213> ORGANISM: Artificial Sequence 

195 <220> FEATURE: 

196 <223> OTHER INFORMATION: 5' RACE primer

198 <400> SEQUENCE: 8 

199 gtgaaaaact tgcttggtct gacc 24 

201 <210> SEQ ID NO: 9 

202 <211> LENGTH: 24 

203 <212> TYPE: DNA 

204 <213> ORGANISM: Artificial Sequence 

206 <220> FEATURE: 

207 <223> OTHER INFORMATION: sense primer for the mutagenesis of L41 gene 

209 <400> SEQUENCE: 9 

210 ggtcagacca agcaagtttt tcac 24 

212 <210> SEQ ID NO: 10 

213 <211> LENGTH: 24 

214 <212> TYPE: DNA 

215 <213> ORGANISM: Artificial Sequence 

217 <220> FEATURE: 

218 <223> OTHER INFORMATION: antisense primer for the mutagenesis of L41 gene 

220 <400> SEQUENCE: 10 

221 gtgaaaaact tgcttggtct gacc 24 

223 <210> SEQ ID NO: 11 

224 <211> LENGTH: 20 

225 <212> TYPE: DNA 

226 <213> ORGANISM: Artificial Sequence 

228 <220> FEATURE: 

229 <223> OTHER INFORMATION: a PCR primer corresponding to 18S rDNA 

231 <400> SEQUENCE: 11 

232 tcctagtaag cgcaagtcat 20 

234 <210> SEQ ID NO: 12 

235 <211> LENGTH: 20 

236 <212> TYPE: DNA 

237 <213> ORGANISM: Artificial Sequence 

239 <220> FEATURE: 

240 <223> OTHER INFORMATION: a PCR primer corresponding to 18S rDNA 
242 <400> SEQUENCE: 12 

243 ttcggccaag gaaagaaact 20

245 <210> SEQ ID NO: 13 

246 <211> LENGTH: 20 
247 <212> TYPE: DNA

248 <213> ORGANISM: Artificial Sequence 

250 <220> FEATURE: 

251 <223> OTHER INFORMATION: a PCR primer corresponding to 28S rDNA 

253 <400> SEQUENCE: 13 

254 aatcggatta tccggagcta 20 

256 <210> SEQ ID NO: 14 

257 <211> LENGTH: 20 

258 <212> TYPE: DNA 

259 <213> ORGANISM: Artificial Sequence 

261 <220> FEATURE: 

262 <223> OTHER INFORMATION: a PCR primer corresponding to 28S rDNA 

264 <400> SEQUENCE: 14 

265 gctataacac atccggagat 20 

267 <210> SEQ ID NO: 15 

268 <211> LENGTH: 2192 

269 <212> TYPE: DNA 

270 <213> ORGANISM: Phaffia rhodozyma 

272 <400> SEQUENCE: 15 

273 aagagctatt tgaatgacga ccacaagagt gacgatcata ttgagcatag tataccaaag 60 

274 gccaagaggc tgtgtggtgt tctatgagtg gccttgatta tgtgttacat aaataaactg 120 

275 atctcaattt ttcaaatact tgccaacact ttcatatatt cacaccaaaa aaagtcagat 180 

276 tggcccacaa agtcagatac acgctcgatc gtcgacgggt tcaagcactt tgtcaggcga 240 

277 aagaaaggcc acagcaccac ccttcaagtc tcgtctcaat caggttcgtc tagctttttg 300 

278 tgtgcaagga tttaccgtct tgatggattt gttcgttgaa agagaggaaa gaacatgctg 360 

279 aactgacgaa agtgtgaaca aaaaattgtg attttttcat tgtgtttcgc tggtctcctt 420 

280 gctgggttgg gttggatcgg atttatcttc tgtgttggat ggaaaaccct gaatgttctt 480 

281 ttcttggaca tcttctaaac tcgacaaaac gattcattcc tccgtactgc tctggttctg 540

282 cctttttgaa tcgcatcgat aaattcttcc ctcggaacgt tcgatcaatc tccgtcaaac 600 

283 ttatcatcca aaaatctctt ctcgactgcc gccttgctcc ttttcttcgt tctttcctta 660 

284 atccgctttc gactaccctc cttctcttca cactcatagt caagatggtc aacgttccca 720 

285 agactcgacg tgagttatag caatttcaac aactctccag acgacaaata ttccagtgca 780 

286 tcgaaagagt ttgtggataa acgcgacagt ttcaagggaa agagtcgatg gacagatttg 840 

287 gaagacttag ccggtcaagg aacttgggga tcacgtggcg gaggactcat cagaagaagt 900 

288 cgggatttgt ttgatcatag tgggatcaag acaaactgga ggatatggct cgccttggaa 960 

289 gggaatctcc ggcctggatt cgaggatccg aaagttgtac gtatggaaaa gcttacacgg 1020 

290 cttggattta ttatctttca taggaaccta ctgcaagggt aaggcttgca agaagcacac 1080 

291 gtaagtcgct tatcctctcc actctttcat ggcatattgt caacgactgg acaacgcgtc 1140 

292 cgttttgaaa caagtgactt acctgtgaaa tttgattcta cacctgtatt tagccctcac 1200 

293 aaggtacata tcacatcctc ccaccccacc ctgcccaact tcttcagttc atcttgctct 1260 

294 cggtttccac attccctgat gacctccttg tatgttcttt gcgaacgttt gtttctgttt 1320 

295 ctgtaggtga cccagtacaa gaagggaaag gactccatct tcgcccaggg aaagcgacga 1380 

296 tacgaccgaa agcagtccgg ttacggaggt cagaccaagc ccgtttttca caagaaggct 1440 

297 aagaccacca agaaggtcgt ccttcgattg ggtacgtttt tgtttatttt gaattctttt 1500 

298 tgtgtatgca gacttttgat gattatgctc ctctgtcgtt ttttctcttc aaacagagtg 1560 

299 ctccgtctgc agttcgttct tccttccaac caaaacttca actacagaca tcataaacag 1620 

300 acatcttact tcggtgttct ctcttttttt ccgcagagta caagatgcag atgaccctca 1680 

301 agcgatgcaa gcacttcgag cttggaggag acaagaagac caagggttcg tcttttgtcc 1740 

302 atatattctc tggttcactt cttatgttcc taacgtactt gtttcctttt tggttcggat 1800 
Please Note:

Use of n and/or Xaa has been detected in the Sequence Listing. Please review the
Sequence Listing to ensure that a corresponding explanation is presented in the <220>
to <223> fields of each sequence which presents at least one n or Xaa.

Seq#:4; N Pos. 334,371
Seq#:5; N Pos. 15,18
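
The reported n positions can be cross-checked against the sequence text itself. The following is a minimal sketch, not part of the original listing; the function name `n_positions` is hypothetical, and it assumes the input contains only the residue groups and the trailing running total of a sequence line, with the leading listing line number already removed.

```python
def n_positions(seq_text):
    """Return 1-based positions of 'n' in a listing-style sequence string."""
    # Keep only letters: this drops the spaces between 10-residue groups,
    # the running totals at the end of each line, and any newlines.
    residues = [ch for ch in seq_text.lower() if ch.isalpha()]
    return [i for i, ch in enumerate(residues, start=1) if ch == "n"]


# SEQ ID NO: 5 as printed in this listing (listing line 165).
seq5 = "cgcgtagtta aygtnccnaa rac 23"
print(n_positions(seq5))  # [15, 18]
```

For a multi-line sequence such as SEQ ID NO: 4, the lines would be joined into one string before calling the function, so that positions are counted from the first residue of the sequence.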
VERIFICATION SUMMARY

PATENT APPLICATION: US/09/830,691A



DATE: 03/19/2003 
TIME: 14:36:23 



Input Set : A:\Seqlist.txt

Output Set: N:\CRF4\03192003\I830691A.raw 



L:142 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#: 4 after pos.: 300
M:341 Repeated in SeqNo=4

L:160 M:281 W: Numeric Fields not Ordered, <221> Sort in ascending order!
L:164 M:258 W: Mandatory Feature missing, <220> Tag not found for SEQ ID#: 5
L:165 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#: 5 after pos.: 0